Current Issue: October–December | Volume: 2025 | Issue Number: 4 | Articles: 5
The elastic, mechanical, acoustic, and thermal properties of Ti3SiC2, Ti3IrC2, and Ti3AuC2 MAX phases were systematically investigated using first-principles calculations based on density functional theory. The computed lattice parameters and elastic, mechanical, and acoustic properties were consistent with existing experimental and theoretical findings, confirming the intrinsic mechanical stability of these MAX phases. Single-crystal elastic stiffness constants were used to derive the polycrystalline elastic moduli, the directional dependence of the bulk, shear, and Young's moduli, and the anisotropy factors. The results revealed a ductility sequence of Ti3SiC2 < Ti3IrC2 < Ti3AuC2, with Ti3IrC2 and Ti3AuC2 exhibiting greater elastic anisotropy than Ti3SiC2. Additionally, sound velocities, Debye temperatures, minimum thermal conductivities, melting points, and Grüneisen parameters were determined. The findings showed that Ti3SiC2 outperforms Ti3IrC2 and Ti3AuC2 in sound velocities (including the average sound velocity), Debye temperature, and minimum thermal conductivity, while Ti3IrC2 has the highest melting point and Ti3AuC2 the largest Grüneisen parameter. These results provide valuable insights into the design of related materials for high-performance applications.
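For readers who want to reproduce the polycrystalline averaging step, the sketch below follows the standard Voigt–Reuss–Hill scheme for hexagonal crystals together with the usual sound-velocity and Debye-temperature relations; the elastic constants, density, and molar mass in the example are illustrative placeholders, not the values computed in the paper.

```python
# Minimal sketch: Voigt-Reuss-Hill polycrystalline moduli and Debye temperature
# from hexagonal single-crystal elastic constants. The Cij, density, and formula
# data below are illustrative placeholders, not the paper's computed values.
import numpy as np

h  = 6.62607015e-34   # Planck constant, J s
kB = 1.380649e-23     # Boltzmann constant, J/K
NA = 6.02214076e23    # Avogadro constant, 1/mol

def hill_moduli(C11, C12, C13, C33, C44):
    """Voigt-Reuss-Hill bulk and shear moduli (GPa) for a hexagonal crystal."""
    C66 = (C11 - C12) / 2.0
    M   = C11 + C12 + 2.0 * C33 - 4.0 * C13
    Csq = (C11 + C12) * C33 - 2.0 * C13 ** 2
    BV  = (2.0 * (C11 + C12) + C33 + 4.0 * C13) / 9.0
    GV  = (M + 12.0 * C44 + 12.0 * C66) / 30.0
    BR  = Csq / M
    GR  = 2.5 * Csq * C44 * C66 / (3.0 * BV * C44 * C66 + Csq * (C44 + C66))
    return 0.5 * (BV + BR), 0.5 * (GV + GR)

def debye_temperature(B, G, rho, molar_mass, n_atoms):
    """Debye temperature (K) from Hill moduli (GPa), density (kg/m^3),
    molar mass of the formula unit (kg/mol), and atoms per formula unit."""
    B, G = B * 1e9, G * 1e9                               # GPa -> Pa
    vl = np.sqrt((B + 4.0 * G / 3.0) / rho)               # longitudinal velocity
    vt = np.sqrt(G / rho)                                 # transverse velocity
    vm = ((2.0 / vt ** 3 + 1.0 / vl ** 3) / 3.0) ** (-1.0 / 3.0)   # average velocity
    return (h / kB) * (3.0 * n_atoms * NA * rho / (4.0 * np.pi * molar_mass)) ** (1.0 / 3.0) * vm

# Illustrative, order-of-magnitude numbers for a Ti3SiC2-like phase:
B, G = hill_moduli(C11=365, C12=125, C13=120, C33=375, C44=122)
theta = debye_temperature(B, G, rho=4500.0, molar_mass=0.1957, n_atoms=6)
print(f"B = {B:.1f} GPa, G = {G:.1f} GPa, Theta_D ~ {theta:.0f} K")
```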
A model of the vocal tract that mimicked velopharyngeal insufficiency was created, and acoustic analysis was performed using the boundary element method to clarify the acoustic characteristics of velopharyngeal insufficiency. The participants were six healthy adults. Computed tomography (CT) images were taken from the frontal sinus to the glottis during phonation of the Japanese vowels /i/ and /u/, and models of the vocal tracts were created from the CT data. To recreate velopharyngeal insufficiency, the nasopharynx was coupled in vocal tract models that originally had no nasopharyngeal coupling, and the coupling site was enlarged in models that already had nasopharyngeal coupling. The vocal tract models were extended virtually by 12 cm in a cylindrical shape to represent the region from the lower part of the glottis to the tracheal bifurcation. The Kirchhoff–Helmholtz integral equation was used for the wave equation, and the boundary element method was used for discretization. Frequency response curves from 1 to 3000 Hz were calculated by applying the boundary element method. The curves showed the appearance of a pole–zero pair around 500 Hz, increased intensity around 250 Hz, decreased intensity around 500 Hz, decreased intensities of the first and second formants (F1 and F2), and a lower frequency of F2. Of these findings, the increased intensity around 250 Hz, decreased intensity around 500 Hz, decreased intensities of F1 and F2, and lower frequency of F2 agree with the previously reported acoustic characteristics of hypernasality.
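For context, one common form of the Kirchhoff–Helmholtz integral equation for the acoustic pressure p at wavenumber k is shown below; the sign convention depends on the normal orientation, and the specific formulation and boundary conditions used in the study are not detailed in the abstract.

```latex
c(\mathbf{x})\,p(\mathbf{x})
  = \int_{S} \left[ G(\mathbf{x},\mathbf{y})\,\frac{\partial p(\mathbf{y})}{\partial n_{\mathbf{y}}}
  - p(\mathbf{y})\,\frac{\partial G(\mathbf{x},\mathbf{y})}{\partial n_{\mathbf{y}}} \right] \mathrm{d}S(\mathbf{y}),
\qquad
G(\mathbf{x},\mathbf{y}) = \frac{e^{\,ik\,|\mathbf{x}-\mathbf{y}|}}{4\pi\,|\mathbf{x}-\mathbf{y}|},
```

where c(x) is 1 at points inside the air volume and 1/2 at smooth boundary points. Discretizing p and its normal derivative over boundary elements turns this into a linear system that is solved at each frequency to build the 1–3000 Hz response curves.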
Deepfake technology uses artificial intelligence to create highly realistic but fake audio, video, or images that are often difficult to distinguish from real content. Because of its potential use for misinformation, fraud, and identity theft, deepfake technology has gained a bad reputation in the digital world. Many recent works have reported on the detection of deepfake videos and images, but few studies have concentrated on developing robust deepfake voice detection systems. In most existing studies in this field, a deepfake voice detection system requires a large amount of training data and a robust backbone to distinguish real audio from logical-access (spoofed) audio. For acoustic feature extraction, Mel-frequency Filter Bank (MFB)-based approaches are better suited to speech signals than using the raw spectrum as input. Recurrent Neural Networks (RNNs) have been successfully applied to Natural Language Processing (NLP), but these backbones suffer from vanishing or exploding gradients when processing long sequences. In addition, most deepfake voice recognition systems perform poorly in cross-dataset evaluation, which points to a robustness issue. To address these issues, we propose an acoustic feature-fusion method that combines the Mel-spectrum and a pitch representation using cross-attention mechanisms. We then combine a Transformer encoder with a convolutional neural network block to extract global and local features as a front end, and connect the back end to a single linear layer for classification. We summarize the performance of several deepfake voice detectors on the silence-segment-processed ASVspoof 2019 dataset. Our proposed method achieves an Equal Error Rate (EER) of 26.41%, while most existing methods yield EERs higher than 30%. We also tested our proposed method on the ASVspoof 2021 dataset and found that it achieves an EER as low as 28.52%, while the EER values of existing methods are all higher than 28.9%.
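The described architecture can be sketched roughly as follows; this is a minimal PyTorch illustration using standard layers (MultiheadAttention, TransformerEncoder, Conv1d), and all layer sizes, head counts, and the pooling choice are assumptions rather than the paper's settings.

```python
# Minimal PyTorch sketch of the described front end: cross-attention fusion of
# Mel-spectrum and pitch streams, a Transformer encoder plus a small CNN block,
# and a single linear classification layer. Hyperparameters are illustrative.
import torch
import torch.nn as nn

class FusionDetector(nn.Module):
    def __init__(self, n_mels=80, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.mel_proj   = nn.Linear(n_mels, d_model)     # project Mel frames
        self.pitch_proj = nn.Linear(1, d_model)          # project per-frame pitch (F0)
        # Cross-attention: Mel frames attend to the pitch representation
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=4 * d_model,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)  # global features
        self.cnn = nn.Sequential(                        # local features
            nn.Conv1d(d_model, d_model, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(d_model, 2)          # bona fide vs. spoof

    def forward(self, mel, pitch):
        # mel: (batch, frames, n_mels); pitch: (batch, frames, 1)
        q = self.mel_proj(mel)
        kv = self.pitch_proj(pitch)
        fused, _ = self.cross_attn(q, kv, kv)            # feature fusion
        g = self.encoder(fused)                          # global context
        l = self.cnn(g.transpose(1, 2)).transpose(1, 2)  # local context
        pooled = (g + l).mean(dim=1)                     # temporal average pooling
        return self.classifier(pooled)

# Example: 200 frames of an 80-bin Mel-spectrogram plus frame-level F0
logits = FusionDetector()(torch.randn(4, 200, 80), torch.randn(4, 200, 1))
```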
The geometric distribution of seabed beacons significantly affects the positioning accuracy of underwater acoustic navigation systems. To address this challenge, we propose a depth-constrained adaptive stochastic model optimization method based on singular value decomposition (SVD). The method quantifies each beacon's contribution weight to the dominant navigation direction by performing SVD on the acoustic observation matrix. The acoustic ranging covariance matrix is then dynamically adjusted according to these weights to suppress error propagation. At the same time, the prior depth with centimeter-level accuracy provided by the pressure sensor is used to establish a strong constraint in the vertical direction. The experimental results demonstrate that the depth-constrained adaptive stochastic model optimization method reduces three-dimensional RMS errors by 66.65% (300 m depth) and 77.25% (2000 m depth) compared to conventional equal-weight models. Notably, the depth constraint alone achieves 95% vertical error suppression, while the combined SVD optimization further improves horizontal accuracy by 34.2–53.5%. These findings validate that coupling depth constraints with stochastic optimization effectively improves navigation accuracy in complex underwater environments.
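A minimal numpy sketch of the idea is shown below, assuming a Gauss-Newton range solver; the particular weight formula (squared projections of each beacon's line-of-sight row onto the singular directions, scaled by relative singular values), the noise figures, and the depth convention are illustrative assumptions, not the paper's exact stochastic model.

```python
# Illustrative sketch (not the paper's exact algorithm): weight each beacon by its
# contribution to the dominant directions of the observation geometry via SVD,
# scale the ranging covariance accordingly, and add a tight depth pseudo-observation
# from the pressure sensor before solving a weighted least-squares update.
import numpy as np

def solve_position(beacons, ranges, x0, depth, sigma_range=0.5, sigma_depth=0.02):
    """beacons: (n, 3) seabed coordinates; ranges: (n,) acoustic ranges;
    x0: initial position guess; depth: pressure-derived depth (third coordinate)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(10):                                    # Gauss-Newton iterations
        diff = x - beacons                                 # (n, 3)
        pred = np.linalg.norm(diff, axis=1)
        A = diff / pred[:, None]                           # design (unit line-of-sight) matrix
        # SVD-based per-beacon weights in (0, 1]: how much each row contributes
        # to the dominant singular directions of the geometry.
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        w = np.clip(U ** 2 @ (s / s.max()), 1e-3, None)
        W = np.diag(w / sigma_range ** 2)                  # adaptive stochastic model
        # Depth constraint as an extra pseudo-observation on the vertical component
        A_aug = np.vstack([A, [0.0, 0.0, 1.0]])
        r_aug = np.append(ranges - pred, depth - x[2])
        W_aug = np.zeros((len(w) + 1, len(w) + 1))
        W_aug[:len(w), :len(w)] = W
        W_aug[-1, -1] = 1.0 / sigma_depth ** 2
        dx = np.linalg.solve(A_aug.T @ W_aug @ A_aug, A_aug.T @ W_aug @ r_aug)
        x += dx
        if np.linalg.norm(dx) < 1e-4:
            break
    return x
```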
Machine learning (ML) robustness for voice disorder detection was evaluated using reverberation-augmented recordings. Voice features commonly used in vocal health assessment, extracted from steady vowel samples (135 pathological, 49 controls), were used to train and test six ML classifiers. Detection performance was evaluated under low reverberation and under simulated medium (0.48 s) and high (1.82 s) reverberation times. All models' performance declined with longer reverberation. The Support Vector Machine exhibited slight robustness but still lost performance. Random Forest and Gradient Boosting, though strong under low reverberation, lacked generalizability under medium and high reverberation. Training and testing ML models on augmented data is essential to enhance their reliability in real-world voice assessments.
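A minimal sketch of the reverberation-augmentation step is given below; the reverberation-time targets (0.48 s and 1.82 s) come from the abstract, while the synthetic exponentially decaying noise impulse response and the sampling rate are illustrative assumptions.

```python
# Minimal sketch: simulate reverberation by convolving a clean vowel recording with
# a synthetic impulse response whose decay matches a target RT60. Feature extraction
# and classifier settings used in the study are not reproduced here.
import numpy as np
from scipy.signal import fftconvolve

def add_reverb(signal, fs, rt60):
    """Convolve with a white-noise impulse response decaying 60 dB over rt60 seconds."""
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    ir = np.random.randn(n) * 10 ** (-3.0 * t / rt60)    # -60 dB reached at t = rt60
    ir /= np.sqrt(np.sum(ir ** 2))                        # keep overall energy comparable
    return fftconvolve(signal, ir)[: len(signal)]

fs = 16000
clean  = np.random.randn(3 * fs)          # placeholder for a steady-vowel recording
medium = add_reverb(clean, fs, rt60=0.48) # simulated medium reverberation
high   = add_reverb(clean, fs, rt60=1.82) # simulated high reverberation
```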